We consider the possibility of encoding m classical bits into much fewer n quantum bits so that an
arbitrary bit from the original m bits can be recovered with a good probability, and we show that
non-trivial quantum encodings exist that have no classical counterparts. On the other hand, we show that
quantum encodings cannot be much more succinct as compared to classical encodings, and we provide a
lower bound on such quantum encodings. Finally, using this lower bound, we prove an exponential lower
bound on the size of 1-way quantum finite automata for a family of languages accepted by linear sized
deterministic finite automata.
1 Introduction

The tremendous information processing capabilities of quantum mechanical systems may be attributed to the fact
that the state of an n quantum bit (qubit) system is given by a unit vector in a $2^n$-dimensional complex vector space.
Since $2^n - 1$ complex numbers are necessary to completely specify the state of n quantum bits, it may appear that it
is possible to encode a lot of information into it. Nonetheless, a fundamental result in quantum information theory,
Holevo's theorem, states that no more than n classical bits of information can be transmitted by transferring n
quantum bits from one party to another. In view of this result, it is tempting to conclude that the exponentially
many degrees of freedom latent in the description of a quantum system must necessarily stay hidden or inaccessible.

However, the situation is more subtle since in quantum mechanics, the recipient of the n qubits has a choice of
measurements he can make to extract information about their state. In general, these measurements do not commute.
Thus making a particular measurement will, in general, disturb the system, thereby destroying some or all the
information that would have been revealed by another possible measurement. This opens up the possibility of
quantum random access encodings. Say we wish to encode m classical bits $b_1 \cdots b_m$ into n quantum bits ($m \gg n$).
Then a quantum random access encoding with parameters m, n, p (or simply an $m \overset{p}{\mapsto} n$ encoding) consists of an
encoding map from $\{0,1\}^m$ to $\mathbb{C}^{2^n}$, together with a sequence of m possible measurements for the recipient. If the
recipient chooses the ith measurement and applies it to the encoding of $b = b_1 \ldots b_m$, the result of the measurement
is $b_i$ with probability at least p.

Definition 1.1 An $m \overset{p}{\mapsto} n$ random access encoding is a function $f : \{0,1\}^m \times R \to \mathbb{C}^{2^n}$ such that for every $1 \le i \le m$,
there is a measurement $O_i$ that returns 0 or 1 and has the property that

$$\forall b \in \{0,1\}^m : \mathrm{Prob}\big( O_i\, |f(b, r)\rangle = b_i \big) \ge p.$$

We call f the encoding function, and $O_i$ the decoding functions.

Notice that given the n qubits corresponding to a random access encoding of some m bits, the recipient cannot simply
make all m measurements and retrieve the encoded bits (thus violating Holevo's bound), since any measurement
disturbs the state vector. A priori, there is no reason to rule out the existence of a $c^n \overset{p}{\mapsto} n$ encoding for constants c > 1,
p > 1/2. In fact, even though $\mathbb{C}^k$ can accommodate only k mutually orthogonal unit vectors, it can accommodate $c^k$
almost mutually orthogonal unit vectors (i.e. vectors such that the dot product of any two has absolute value less
than 1/10, say). This might lead one to believe that such encodings exist. If such quantum random access encodings
were possible, it would be possible to, for instance, encode the contents of an entire telephone directory in a few
quantum bits such that the recipient of these qubits could, via a suitably chosen measurement, look up any single
telephone number of his choice.

The main question that we consider in this paper is: for what values of m, n and p do $m \overset{p}{\mapsto} n$ encodings exist?
For classical encodings, where we encode m classical bits into n classical bits, we know the answer. Let, for $p \in [0, 1]$,
$H(p) = -p \log p - (1 - p) \log(1 - p)$ denote the binary entropy function. We show:

Theorem 1.1 For any p > 1/2, there exist $m \overset{p}{\mapsto} n$ classical encodings with $n = (1 - H(p))m + O(\log m)$, and
any $m \overset{p}{\mapsto} n$ classical encoding has $n \ge (1 - H(p))m$.
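
To get a feel for the classical bound in Theorem 1.1, the following minimal sketch evaluates $(1 - H(p))m$ for a few success probabilities. The function names `binary_entropy` and `classical_lower_bound` are illustrative choices, and logarithms are taken base 2 so that the result is in bits.

```python
import math


def binary_entropy(p: float) -> float:
    """H(p) = -p log2 p - (1 - p) log2(1 - p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def classical_lower_bound(m: int, p: float) -> float:
    """The quantity (1 - H(p)) * m from Theorem 1.1 (illustrative helper)."""
    return (1 - binary_entropy(p)) * m


if __name__ == "__main__":
    # The bound vanishes at p = 1/2 and approaches m as p approaches 1.
    for p in (0.5, 0.75, 0.85, 0.99):
        print(p, classical_lower_bound(1000, p))
```

At p = 1/2 the bound is zero, matching the fact that a decoder that guesses at random needs no message at all.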

We then show that quantum encodings are more powerful than classical encodings. On the one hand, we show that
no classical encoding can encode two bits into one bit with decoding success probability greater than 0.5, and on
the other hand, we exhibit a $2 \overset{0.85}{\mapsto} 1$ quantum encoding. In fact, as Ike Chuang has shown, it is possible to
encode 3 bits into 1 qubit with success probability $\approx 0.79$ by taking advantage of the fact that the amplitudes in
quantum states can be complex numbers. The 2-into-1 quantum encoding and the 3-into-1 encoding easily generalize
to a $2n \overset{0.85}{\mapsto} n$ and a $3n \overset{0.79}{\mapsto} n$ encoding, respectively. However, the question as to whether quantum encodings can
asymptotically beat the classical lower bound of Theorem 1.1 is left open. Our main result about quantum encodings
is that they cannot be much smaller than the encoded strings.

Theorem 1.2 If an $m \overset{p}{\mapsto} n$ quantum encoding exists with $p > \frac{1}{2}$ a constant, then $n \ge \Omega(\frac{m}{\log m})$.

Thus, even though quantum random access encodings can beat classical encodings, they cannot be much more
succinct.

We finish the paper with a novel application of our bound to showing a lower bound on the size of 1-way quantum
finite automata (QFAs). (See Section 5.1 for a precise definition of 1-way QFAs.) In [10] it was shown that not every
language recognized by a (classical) deterministic finite automaton (DFA) can be recognized by a 1-way QFA. On the
other hand, there are languages that can be recognized by 1-way QFAs with size exponentially smaller than that of
corresponding classical automata [2]. It remained open whether, for any language that can be recognized by a 1-way
finite automaton both classically and quantum-mechanically, we can efficiently simulate the classical automaton by a
1-way QFA. Our result answers this question in the negative, and demonstrates that while in some cases one is able
to exploit quantum phenomena to construct highly space-efficient 1-way QFAs, in others, as will become apparent,
the requirement of the unitarity (or, in other words, reversibility) of evolution seriously limits their efficiency.

Theorem 1.3 Let $\{L_n\}_{n \ge 1}$ be a family of languages defined by $L_n = \{wa \mid w \in \{a, b\}^*, |w| \le n\}$. Then,

1. $L_n$ is recognized by a 1-way deterministic automaton of size O(n),

2. $L_n$ is recognized by some 1-way quantum finite automaton, and,

3. Any 1-way quantum automaton recognizing $L_n$ with some constant probability greater than $\frac{1}{2}$ has $2^{\Omega(n/\log n)}$
states.

2 The classical bounds

We first prove a lower bound on the number of bits required for a classical random access encoding, and then show
that there are classical encodings that nearly achieve this bound. Together, these yield Theorem 1.1 of the previous
section.

The proof of the lower bound involves the concepts of the Shannon entropy S(X) of a random variable X, the
Shannon entropy S(X|Y) of a random variable X conditioned on another random variable Y, and the mutual
information I(X : Y) of a pair of random variables X, Y. For definitions and basic facts involving these concepts,
we refer the reader to a standard text on information theory.



Theorem 2.1 Let $1/2 < p \le 1$. For any classical $m \overset{p}{\mapsto} n$ encoding, $n \ge (1 - H(p))m$.



Proof: Suppose there is such a (possibly probabilistic) encoding f. Let $X = X_1 \cdots X_m$ be chosen uniformly at
random from $\{0,1\}^m$, and let $Y = f(X) \in \{0,1\}^n$ be the corresponding encoding. Let Z be the random variable
with values in $\{0,1\}^m$ obtained by generating the bits $Z_1 \cdots Z_m$ from Y using the m decoding functions.

The mutual information of X and Y is clearly bounded by the number of bits in Y, i.e. n:

$$I(X : Y) \le S(Y) \le n.$$

We show below that it is, in fact, lower bounded by $(1 - H(p))m$, thus getting our lower bound.
Now,

$$I(X : Y) = S(X) - S(X|Y) = m - S(X|Y).$$

But, using standard properties of the entropy function, we have

$$S(X|Y) \le S(X|Z) \le \sum_i S(X_i|Z) \le \sum_i S(X_i|Z_i).$$

It is not difficult to see that $S(X_i|Z_i) \le H(p)$. It follows that $S(X|Y) \le H(p)m$, and that $I(X : Y) \ge (1 - H(p))m$,
as we intended to show. ■

We now give an almost matching upper bound: 



Theorem 2.2 There is a classical $m \overset{p}{\mapsto} n$ encoding with $n = (1 - H(p))m + O(\log m)$ for any $p > \frac{1}{2}$.

Proof: The encoding is trivial for $p > 1 - \frac{1}{m}$. We describe the encoding for $p \le 1 - \frac{1}{m}$ below.

We use a code $S \subseteq \{0,1\}^m$ such that, for every $x \in \{0,1\}^m$, there is a $y \in S$ within Hamming distance
$(1 - p - \frac{1}{m})m$. It is known that there is such a code S of size

$$|S| = 2^{(1 - H(p + \frac{1}{m}))m + 2\log m} \le 2^{(1 - H(p))m + 4\log m}.$$

Let S(x) denote the codeword closest to x. One possibility is to encode a string x by S(x). This would give us an
encoding of the right size. Further, for every x, at least $(p + \frac{1}{m})m$ out of the m bits would be correct. This means
that the probability (over all bits i) that $x_i = S(x)_i$ is at least $p + 1/m$. However, for our encoding we need this
probability to be at least p for every bit, not just on average over all bits. This can be achieved with the following
modification.

Let r be an m-bit string, and $\pi$ be a permutation of $\{1, \ldots, m\}$. For a string $x \in \{0,1\}^m$, let $\pi(x)$ denote the
string $x_{\pi(1)} x_{\pi(2)} \cdots x_{\pi(m)}$.

We consider encodings $S_{\pi,r}$ defined by $S_{\pi,r}(x) = \pi^{-1}(S(\pi(x + r))) + r$. We show that if $\pi$ and r are chosen uniformly
at random, then for any x and any index i, the probability that the ith bit in the encoding is different from $x_i$ is at
most $1 - p - 1/m$. First, note that if i is also chosen uniformly at random, then this probability is clearly bounded
by $1 - p - 1/m$. So all we need to do is to show that this probability is independent of i.

Figure 1: A 2-into-1 quantum encoding with probability of success $\approx 0.85$.

If $\pi$ and r are uniformly random, then $\pi(x + r)$ is uniformly random as well. Furthermore, for a fixed $y = \pi(x + r)$, there
is exactly one r corresponding to any permutation $\pi$ that gives $y = \pi(x + r)$. Hence, if we condition on $y = \pi(x + r)$,
all $\pi$ (and, hence, all $\pi^{-1}(i)$) are equally likely. This means that the probability that $x_i \ne S_{\pi,r}(x)_i$ (or, equivalently,
that $\pi(x + r)_{\pi^{-1}(i)} \ne S(\pi(x + r))_{\pi^{-1}(i)}$) for random $\pi$ and r is just the probability of $y_j \ne S(y)_j$ for random y and j.
This is clearly independent of i (and x).

Finally, we show that there is a small set of permutation-string pairs such that the desired property continues to hold
if we choose $\pi$, r uniformly at random from this set, rather than the entire space of permutations and strings. We
employ the probabilistic method to prove the existence of such a small set of permutation-string pairs.

Let $\ell = m^3$, and let the strings $r_1, \ldots, r_\ell \in \{0,1\}^m$ and permutations $\pi_1, \ldots, \pi_\ell$ be chosen independently and
uniformly at random. Fix $x \in \{0,1\}^m$ and $i \in [1..m]$. Let $X_j$ be 1 if $x_i \ne S_{\pi_j,r_j}(x)_i$ and 0 otherwise. Then $\sum_{j=1}^{\ell} X_j$ is
a sum of $\ell$ independent Bernoulli random variables, the mean of which is at most $(1 - p - 1/m)\ell$. Note that $\frac{1}{\ell}\sum_{j=1}^{\ell} X_j$
is the probability of encoding the ith bit of x erroneously when the permutation-string pair is chosen uniformly
at random from the set $\{(\pi_1, r_1), \ldots, (\pi_\ell, r_\ell)\}$. By the Chernoff bound, the probability that the sum $\sum_{j=1}^{\ell} X_j$ is
at least $(1 - p - 1/m)\ell + m^2$ (i.e., that the error probability $\frac{1}{\ell}\sum_{j=1}^{\ell} X_j$ mentioned above is at least $1 - p$) is
bounded by $e^{-2m^4/\ell} = e^{-2m}$. Now, the union bound implies that the probability that the ith bit of x is encoded
erroneously with probability more than $1 - p$ for any x or i is at most $m 2^m e^{-2m} < 1$. Thus, there is a combination
of strings $r_1, \ldots, r_\ell$ and permutations $\pi_1, \ldots, \pi_\ell$ with the property we seek. We fix such a set of $\ell$ strings and
permutations.

We can now define our random access code as follows. To encode x, we select $j \in \{1, \ldots, \ell\}$ uniformly at random
and compute $y = S_{\pi_j,r_j}(x)$. This is the encoding of x. To decode the ith bit, we just take $y_i$. For this scheme, we
need $\log(\ell|S|) = \log \ell + \log|S| = (1 - H(p))m + 7\log m$ bits. This completes the proof of the theorem. ■
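
The arithmetic behind the union bound can be checked directly. The sketch below is only a numerical sanity check under the parameters stated in the proof ($\ell = m^3$, deviation $m^2$); the helper names `chernoff_exponent` and `union_bound` are hypothetical.

```python
import math


def chernoff_exponent(m: int) -> float:
    """Exponent 2 * t^2 / l with deviation t = m^2 and l = m^3 samples."""
    ell = m ** 3
    deviation = m ** 2
    return 2 * deviation ** 2 / ell  # simplifies to 2 * m


def union_bound(m: int) -> float:
    """m * 2^m * e^{-2m}: failure bound over all m indices i and 2^m strings x."""
    return m * 2 ** m * math.exp(-chernoff_exponent(m))


if __name__ == "__main__":
    for m in (1, 2, 5, 10, 50):
        print(m, union_bound(m))
```

Since $m 2^m e^{-2m} = m (2/e^2)^m$ and $2/e^2 < 0.28$, the bound is below 1 for every $m \ge 1$.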

3 A gap between quantum and classical encodings 

In this section, we show, by exhibiting an encoding which has no classical counterpart, that quantum encodings give 
us some advantage over classical encodings. 

Lemma 3.1 There is a $2 \overset{0.85}{\mapsto} 1$ quantum encoding.

Figure 2: A geometric characterization of the probabilistic decoding functions of Lemma 3.2.

Proof: Let $u_0 = |0\rangle$, $u_1 = |1\rangle$, and $v_0 = \frac{1}{\sqrt{2}}(|1\rangle + |0\rangle)$, $v_1 = \frac{1}{\sqrt{2}}(|1\rangle - |0\rangle)$. Define $f(x_1, x_2)$, the encoding of the
string $x_1 x_2$, to be $u_{x_1} + v_{x_2}$ normalized, where the sign of $v_{x_2}$ is chosen so that its inner product with $u_{x_1}$ is
non-negative; a sign change does not affect the measurements below (see Figure 1). The decoding functions are defined as follows: for the first
bit $x_1$, we measure the message qubit according to the u basis and associate $u_0$ with $x_1 = 0$ and $u_1$ with $x_1 = 1$.
Similarly, for the second bit, we measure according to the v basis, and associate $v_0$ with $x_2 = 0$ and $v_1$ with $x_2 = 1$.

It is easy to verify that for all four codewords, and for any i = 1, 2, the angle between the codeword and the right
subspace is $\pi/8$. Hence the success probability is $\cos^2(\pi/8) \approx 0.853$. ■
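
The claim about the angle can be checked numerically. The following is a minimal sketch using real two-dimensional vectors; it assumes that each codeword is the normalized bisector of $u_{x_1}$ and $\pm v_{x_2}$, with the sign picked so that the two vectors make an acute angle. The names `encode` and `success_probability` are illustrative only.

```python
import math

S = 1 / math.sqrt(2)
U = [(1.0, 0.0), (0.0, 1.0)]  # u_0 = |0>, u_1 = |1>
V = [(S, S), (-S, S)]         # v_0 = (|1>+|0>)/sqrt(2), v_1 = (|1>-|0>)/sqrt(2)


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def encode(x1, x2):
    """Normalized u_{x1} +/- v_{x2}; the sign convention is an assumption."""
    u, v = U[x1], V[x2]
    sign = 1.0 if dot(u, v) >= 0 else -1.0  # make the angle acute
    w = (u[0] + sign * v[0], u[1] + sign * v[1])
    norm = math.sqrt(dot(w, w))
    return (w[0] / norm, w[1] / norm)


def success_probability(x1, x2, i):
    """Probability that measuring in the u basis (i=1) or v basis (i=2)
    yields the basis vector associated with the encoded bit x_i."""
    target = U[x1] if i == 1 else V[x2]
    return dot(encode(x1, x2), target) ** 2


if __name__ == "__main__":
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, success_probability(x1, x2, 1),
                  success_probability(x1, x2, 2))
```

Every printed probability equals $\cos^2(\pi/8) = (1 + 1/\sqrt{2})/2$ up to floating-point rounding.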

Lemma 3.2 No $2 \overset{p}{\mapsto} 1$ classical encoding exists for any $p > \frac{1}{2}$.

Proof: Suppose there is a classical $2 \overset{p}{\mapsto} 1$ encoding for some $p > \frac{1}{2}$. Let $f : \{0,1\}^2 \times R \to \{0,1\}$ be the corresponding
probabilistic encoding function and $V_i : \{0,1\} \times R' \to \{0,1\}$ the probabilistic decoding functions. If we let $y_i$ be the
random variable $V_i(f(x, r), r')$, then for any $x \in \{0,1\}^2$, and any $i \in \{1, 2\}$, $\mathrm{Prob}_{r,r'}(y_i = x_i) \ge p$.

We first give a geometric characterization of the decoding functions. Each $V_i$ clearly depends only on the encoding,
which is either 0 or 1. Define the point $P^j$ (for j = 0, 1) in the unit square $[0,1]^2$ as $P^j = (a_1^j, a_2^j)$, where
$a_i^j = \mathrm{Prob}_{r'}( V_i(j, r') = 1 )$. The point $P^0$ characterizes the decoding functions when the encoding is 0, and $P^1$ characterizes
the decoding functions when the encoding is 1. For example, $P^1 = (1, 1)$ means that given the encoding 1, the
decoding functions return $y_1 = 1$ and $y_2 = 1$ with certainty, and $P^0 = (0, 1/4)$ means that given the encoding 0, the
decoding functions return $y_1 = 0$ and, with probability 1/4, that $y_2 = 1$.

Any string $x = x_1 x_2 \in \{0,1\}^2$ is encoded as a 0 with some probability $p_x$ and as a 1 with some probability $1 - p_x$.
If we let $P^x = (a_1^x, a_2^x)$, where $a_i^x$ is the probability that $y_i = 1$, then $P^x = p_x P^0 + (1 - p_x) P^1$. Thus, $P^x$ lies on the
line connecting the two points $P^0$ and $P^1$. On the other hand, for the encoding to be a valid 2-into-1 encoding, the
point $P^x$ should lie strictly inside the quarter of the unit square $[0,1]^2$ closest to $(x_1, x_2)$.

Now, the line connecting $P^0$ and $P^1$ intersects the interiors of only three of the four quarters of the unit square $[0,1]^2$.
For instance, if $P^0$ and $P^1$ are as above, then the line connecting them does not pass through the lower right quarter
(see Figure 2). Thus, for the string $x_1 x_2$ which is favored by that quarter (e.g. the string x = 10 in the example
above), either $V_1$ or $V_2$ errs with probability at least a half, which is a contradiction. ■
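
The geometric argument can be illustrated by a small exhaustive search. The sketch below is not a proof: it only scans decoder points $P^0$, $P^1$ and mixing weights $p_x$ on a coarse grid (an assumption made to keep the search finite) and reports the best worst-case success probability found. All names are illustrative.

```python
from itertools import product

GRID = [k / 4 for k in range(5)]  # 0, 0.25, 0.5, 0.75, 1


def success(point, x):
    """Worst case over i of Prob(y_i = x_i) when the decoder sits at `point`,
    where point[i] is the probability that y_i = 1."""
    return min(a if bit == 1 else 1 - a for a, bit in zip(point, x))


def best_worst_case():
    best = 0.0
    for p0 in product(GRID, repeat=2):
        for p1 in product(GRID, repeat=2):
            worst = 1.0
            for x in product((0, 1), repeat=2):
                # Best mixing weight p_x for this string x.
                best_for_x = max(
                    success(tuple(w * a + (1 - w) * b
                                  for a, b in zip(p0, p1)), x)
                    for w in GRID
                )
                worst = min(worst, best_for_x)
            best = max(best, worst)
    return best


if __name__ == "__main__":
    print(best_worst_case())
```

On this grid the search never exceeds 1/2, in line with Lemma 3.2; the value 1/2 itself is reached, for example, when both decoders guess uniformly at random.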

4 The quantum lower bound 

We now prove Theorem 1.2. We first show that the success probability of the decoding process can be amplified at
the cost of a small increase in the length of the random access code.

Lemma 4.1 If for a constant $p > \frac{1}{2}$ there is an $m \overset{p}{\mapsto} n$ encoding, then there is also an $m \overset{1-\epsilon}{\mapsto} O(n \log \frac{1}{\epsilon})$ encoding
for any $\epsilon = \epsilon(m) > 0$.

Proof: Suppose there is an encoding $f : \{0,1\}^m \times R \to \mathbb{C}^{2^n}$ with decoding algorithms $O_i$ ($i = 1, \ldots, m$) with
success probability p > 1/2. We define a new encoding $f^{(t)} : \{0,1\}^m \times R^t \to (\mathbb{C}^{2^n})^{\otimes t}$ as $f^{(t)}(x, r_1, \ldots, r_t) =
f(x, r_1) \otimes \cdots \otimes f(x, r_t)$. I.e., it is the tensor product of t independent identical copies of the original code. The
new decoding functions $O'_i$ consist of applying $O_i$ to each of the t independent copies of the code, and answering
according to the majority. The Chernoff bound shows that the error probability decays exponentially fast in the
number of trials, and is therefore at most $\epsilon$ when t is chosen to be $O(\log \frac{1}{\epsilon})$. ■
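
Because the t copies are measured independently, the majority-vote error is an ordinary binomial tail, which can be computed exactly. The sketch below is a classical calculation under that independence assumption; `majority_error` and `copies_needed` are hypothetical helper names, and only odd t is considered to avoid ties.

```python
import math


def majority_error(p: float, t: int) -> float:
    """Probability that at most t // 2 of t independent trials succeed,
    i.e. that the majority vote over t copies (t odd) is wrong."""
    return sum(math.comb(t, k) * p ** k * (1 - p) ** (t - k)
               for k in range(t // 2 + 1))


def copies_needed(p: float, eps: float) -> int:
    """Smallest odd t whose majority-vote error is at most eps."""
    t = 1
    while majority_error(p, t) > eps:
        t += 2
    return t


if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-4, 1e-8):
        print(eps, copies_needed(0.85, eps))
```

The number of copies grows roughly linearly in $\log(1/\epsilon)$, as the lemma states.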

By choosing $\epsilon = 1/q(m)$ for some polynomial q, we achieve an encoding with error $\epsilon$ at the cost of using an O(log m)
factor more qubits for the encoding. Now the result of any measurement cannot perturb the state vector too much
(i.e. by more than $\sqrt{\epsilon}$). It might seem that this is sufficient to give us the lower bound, since we need to make
only m measurements to recover all m encoded bits, and the error per measurement is only 1/poly(m). However, the
situation is more subtle, since the error on subsequent measurements must take into account both the encoding error,
as well as the error introduced by previous measurements. In fact, a straightforward analysis suggests that the error
doubles with each measurement, thus making such a proof infeasible. Instead, we prove that the errors grow linearly
(rather than exponentially), by first invoking the principle of safe storage (see [5]) to defer all measurements to the
end of a sequence of unitary operations, and then bounding the errors in the computation via a hybrid technique
from earlier work.

Lemma 4.2 If an $m \overset{1-\epsilon}{\mapsto} n$ quantum encoding with $\epsilon \le \frac{1}{64m^2}$ exists, then $n \ge \Omega(m)$.

Proof: We first deal with deterministic quantum encoding, in which the encoding function $f : \{0,1\}^m \to \mathbb{C}^{2^n}$ maps
inputs to pure states. Any such encoding has, for every $i \in [1..m]$, a decoding function which takes a codeword $|\phi\rangle$
and an ancilla $|0^l\rangle$, applies a unitary transformation $V_i$, and makes a measurement. Thus, it resolves the joint space of
codeword and ancilla into two subspaces $W_i^0$ and $(W_i^0)^\perp$ corresponding to the answers 0 and 1 (for the ith bit), respectively.
Given $|\phi, 0^l\rangle$, we can thus decompose it as $|\phi_i^0\rangle + |\phi_i^1\rangle$, where $|\phi_i^0\rangle \in W_i^0$ and $|\phi_i^1\rangle \in (W_i^0)^\perp$.

We now apply the principle of safe storage. Instead of applying $V_i$ and measuring, we use unitary transformations $U_i$
($i = 1, \ldots, m$) that work over the codeword $|\phi\rangle$, the ancilla $|0^l\rangle$ and m output bits $|0^m\rangle$, such that $U_i |\phi_i^0, a\rangle = |\phi_i^0, a\rangle$
and $U_i |\phi_i^1, a\rangle = |\phi_i^1, a \oplus e_i\rangle$, where $e_i$ is the vector $|0, \ldots, 0, 1, 0, \ldots, 0\rangle$ having a 1 entry only in the ith place.

The transformations $U_i$ introduce some garbage at each step, and their composition $U_1 \cdots U_m$ is quite messy. To
analyse their behaviour, we first fix an input x, and imagine ideal unitary transformations $U'_i = U'_i(x)$ that have the
property that for the codeword $|\phi_x\rangle$ of x, $U'_i |\phi_x, 0^l, a\rangle = |\phi_x, 0^l, a \oplus (x_i e_i)\rangle$. Since for any $x \in \{0,1\}^m$ and any $i \in [1..m]$,
the transformation $U_i$ correctly yields the ith bit of x with high probability, the reader can verify that

$$\| U_i |\phi_x, 0^l, a\rangle - U'_i |\phi_x, 0^l, a\rangle \|^2 \le 2\epsilon. \qquad (1)$$

We now claim that the result of applying the transformations $U_i$ does not differ much from that of applying the ideal
transformations $U'_i$.

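Claim 4.1 $\| U_1 \cdots U_m |\phi_x, 0^l, 0^m\rangle - U'_1 \cdots U'_m |\phi_x, 0^l, 0^m\rangle \| \le 2m\sqrt{\epsilon}$.

Proof: We use a hybrid argument:

$$
\begin{aligned}
\| U_1 \cdots U_m |\phi_x, 0^l, 0^m\rangle - U'_1 \cdots U'_m |\phi_x, 0^l, 0^m\rangle \|
\le{} & \| U_1 \cdots U_{m-1} U_m |\phi_x, 0^l, 0^m\rangle - U_1 \cdots U_{m-1} U'_m |\phi_x, 0^l, 0^m\rangle \| \\
& + \| U_1 \cdots U_{m-1} U'_m |\phi_x, 0^l, 0^m\rangle - U_1 \cdots U'_{m-1} U'_m |\phi_x, 0^l, 0^m\rangle \| \\
& + \cdots \\
& + \| U_1 U'_2 \cdots U'_m |\phi_x, 0^l, 0^m\rangle - U'_1 U'_2 \cdots U'_m |\phi_x, 0^l, 0^m\rangle \| .
\end{aligned}
$$

But, since the transformations $U_i$ are unitary, we have:

$$
\begin{aligned}
& \| U_1 \cdots U_{t-1} U_t U'_{t+1} \cdots U'_m |\phi_x, 0^l, 0^m\rangle - U_1 \cdots U_{t-1} U'_t U'_{t+1} \cdots U'_m |\phi_x, 0^l, 0^m\rangle \| \\
& \quad = \| U_t U'_{t+1} \cdots U'_m |\phi_x, 0^l, 0^m\rangle - U'_t U'_{t+1} \cdots U'_m |\phi_x, 0^l, 0^m\rangle \| \\
& \quad = \| U_t |\phi'_{t+1}\rangle - U'_t |\phi'_{t+1}\rangle \| ,
\end{aligned}
$$

where $|\phi'_{t+1}\rangle = U'_{t+1} \cdots U'_m |\phi_x, 0^l, 0^m\rangle$. By the definition of the transformations $U'_i$, $|\phi'_{t+1}\rangle = |\phi_x, 0^l, a\rangle$ with
$a = |0, \ldots, 0, x_{t+1}, \ldots, x_m\rangle$. Hence, by equation (1), $\| U_t |\phi'_{t+1}\rangle - U'_t |\phi'_{t+1}\rangle \| \le \sqrt{2\epsilon}$, and the claimed result follows.
■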

Now we can extract all the bits of x by computing $|\psi\rangle = U_1 \cdots U_m |\phi_x, 0^l, 0^m\rangle$ and measuring the m answer
bits $a_1, \ldots, a_m$. The following claim says that we succeed with high probability.

Claim 4.2 $\mathrm{Prob}(a \ne x) \le 4m\sqrt{\epsilon}$.

Proof: Let $|\psi'\rangle = U'_1 \cdots U'_m |\phi_x, 0^l, 0^m\rangle = |\phi_x, 0^l, x\rangle$. From the claim above, we know that $\| |\psi\rangle - |\psi'\rangle \| \le 2m\sqrt{\epsilon}$.
When we measure the answer bits of $|\psi'\rangle$, we get x with probability 1. Moreover, from the following fact, the
probability of observing x on measuring $|\psi\rangle$ cannot differ from this by very much.

Fact 4.1 Suppose $\| |\psi_1\rangle - |\psi_2\rangle \| \le \delta$. Let O be a measurement with possible results A, and $D_i$ the classical
distributions over A that result from applying O to $|\psi_i\rangle$. Then $\| D_1 - D_2 \|_1 = \sum_{a \in A} |D_1(a) - D_2(a)| \le 2\delta$.

Hence, the probability that $a \ne x$ is at most $4m\sqrt{\epsilon}$. ■
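
Fact 4.1 can be spot-checked numerically for the special case of a measurement in the computational basis. The sketch below draws random pairs of nearby unit vectors and compares the $\ell_1$ distance of the outcome distributions with twice the Euclidean distance of the states. It is a sanity check on random samples, not a proof, and all names are illustrative.

```python
import math
import random


def normalize(v):
    n = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / n for c in v]


def state_distance(a, b):
    """Euclidean distance || |a> - |b> || between two state vectors."""
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(a, b)))


def outcome_l1_distance(a, b):
    """L1 distance between the computational-basis outcome distributions."""
    return sum(abs(abs(x) ** 2 - abs(y) ** 2) for x, y in zip(a, b))


def check_fact(dim=8, trials=1000, seed=0):
    rng = random.Random(seed)
    for _ in range(trials):
        a = normalize([complex(rng.gauss(0, 1), rng.gauss(0, 1))
                       for _ in range(dim)])
        # A perturbed copy of a, renormalized to a unit vector.
        b = normalize([x + 0.1 * complex(rng.gauss(0, 1), rng.gauss(0, 1))
                       for x in a])
        # Small tolerance for floating-point rounding.
        if outcome_l1_distance(a, b) > 2 * state_distance(a, b) + 1e-12:
            return False
    return True


if __name__ == "__main__":
    print(check_fact())
```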

Therefore, we get x with probability at least $1 - 4m\sqrt{\epsilon} \ge 1 - \frac{1}{2} = \frac{1}{2}$. It then follows from Holevo's Theorem
that $n \ge \Omega(m)$.

Now we deal with probabilistic quantum encoding, where we can encode a string $x \in \{0,1\}^m$ as a probabilistic
mixture of pure states. It is well known that we can always purify the system, i.e., we can adjoin ancilla bits to the
encoding, such that the result is a pure state. Now, as before, we may apply the decoding transformations $U_i$ and
retrieve all the encoded bits: for every x, there are ideal transformations $U'_i = U'_i(x)$ that behave almost as $U_i$ (in
the same sense as above) and the same argument again gives us the lower bound on n. ■

Combining the two lemmas above, we get Theorem 1.2. We remark that we may extend this lower bound to
general p > 1/2, by appropriately generalizing Lemma 4.1 above.



4.1 Serial encodings

We note that Theorem 1.2 holds even in a slightly more general scenario, when the decoding functions are allowed
to depend on the string encoded.

Definition 4.1 $f : \{0,1\}^m \times R \to \mathbb{C}^{2^n}$ serially encodes m classical bits into n qubits with p success, if for any $i \in [1..m]$
and $b_{[i+1,m]} = b_{i+1} \cdots b_m \in \{0,1\}^{m-i}$, there is a measurement $O_{i, b_{[i+1,m]}}$ that returns 0 or 1 and has the property
that

$$\forall b \in \{0,1\}^m : \mathrm{Prob}\big( O_{i, b_{[i+1,m]}} |f(b, r)\rangle = b_i \big) \ge p.$$

I.e., we allow the decoding functions to depend on the suffix $b_{i+1} \cdots b_m$ of the string b for recovering the value
of the ith bit $b_i$. The lower bound for quantum random access codes of the previous section also holds for serial
encodings.

Theorem 4.1 Any quantum serial encoding of m bits into n qubits with constant success probability $p > \frac{1}{2}$ has
$n \ge \Omega(\frac{m}{\log m})$.

Figure 3: A DFA that accepts the language $L_n = \{wa \mid w \in \{a, b\}^*, |w| \le n\}$.



Proof: On careful examination, we see that for the proof of Theorem 1.2 to work in this case as well, all we need
to check is that for all $i \in [1..m]$,

$$\| U_i |\phi_x, 0^l, a_i\rangle - U'_i |\phi_x, 0^l, a_i\rangle \|^2 \le 2\epsilon,$$

where $a_i = |0, \ldots, 0, x_{i+1}, \ldots, x_m\rangle$. Although the transformations $U_i$ may now depend on the bits already decoded,
the above bound is easily verified, since $a_i$ contains the required suffix of the encoded word x. ■



5 The lower bound for 1-way quantum finite automata

In this section, we give the details of the proof of Theorem 1.3. The first two parts of Theorem 1.3 are easy. Figure 3
shows a DFA with 2n + 3 states for the language $L_n$. Also, since each $L_n$ is a finite language, there is a 1-way
reversible finite automaton (as defined in Section 5.1), and hence a 1-way QFA that accepts it. What then remains
to be shown is the lower bound on the size of a 1-way QFA accepting the language.
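
The DFA of the first part can be sketched as follows. This construction is one natural way to obtain 2n + 3 states (a start state, one state per length $1 \le k \le n$ and last letter, one accepting state for length n + 1, and a dead state); it is an assumption that it matches Figure 3 state for state. The names `build_dfa` and `accepts` are illustrative.

```python
def build_dfa(n):
    """Return (states, delta, accepting) for a DFA recognizing
    L_n = { wa : w in {a,b}*, |w| <= n }."""
    def target(k, letter):
        # State reached after reading k letters, the last one being `letter`.
        if k <= n:
            return (k, letter)
        if k == n + 1 and letter == "a":
            return "acc"   # longest accepted length, ends in 'a'
        return "dead"      # too long, or length n+1 ending in 'b'

    delta = {}
    for letter in "ab":
        delta[("start", letter)] = target(1, letter)
        for k in range(1, n + 1):
            for last in "ab":
                delta[((k, last), letter)] = target(k + 1, letter)
        delta[("acc", letter)] = "dead"
        delta[("dead", letter)] = "dead"

    states = {"start", "acc", "dead"} | {
        (k, last) for k in range(1, n + 1) for last in "ab"
    }
    accepting = {(k, "a") for k in range(1, n + 1)} | {"acc"}
    return states, delta, accepting


def accepts(n, word):
    _, delta, accepting = build_dfa(n)
    state = "start"
    for letter in word:
        state = delta[(state, letter)]
    return state in accepting


if __name__ == "__main__":
    states, _, _ = build_dfa(3)
    print(len(states))                           # 2n + 3 states for n = 3
    print(accepts(2, "bba"), accepts(2, "bbba"))
```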

Intuitively, since a 1-way QFA is allowed to read input symbols only once, a QFA for $L_n$ necessarily "records" the
last symbol read in its state, and since it is required to be reversible, it is forced to "remember" all the symbols read
until it is clear whether the input is in the language or not. Thus, we expect the state of the automaton after n
input symbols to be an encoding of the n symbols. It is not difficult to see that in the case of a 1-way reversible
automaton that accepts the language $L_n$, the encoding is such that all the n input symbols can be recovered with
certainty. Thus, such an automaton has at least $2^n$ states. However, for reasons stated below, it is not clear in the
case of a general 1-way QFA that the state encodes the input symbols in a "faithful" manner.

• Firstly, a 1-way QFA is allowed to make partial decisions (i.e., it is allowed to accept or reject an input with
some probability before reading all its symbols). We show in Section 5.3 that partial decisions can be "deferred"
for r steps at a cost of only an O(r) factor increase in the size of the automaton. We call the resulting automaton
an r-restricted QFA. Since no input of length more than n + 1 belongs to $L_n$, this means that partial decisions
are not very useful in building "small" automata for the language, and that we can limit our study to that
of n-restricted QFAs.

• Secondly, and more seriously, the encoding defined by the automaton is such that each input symbol is accessible
via a measurement only when all the symbols following it are known, and by trying to learn the later symbols
we might destroy the encoding.

This problem is exactly the one Theorem 4.1 solves. We can thus conclude that the number of qubits required
to represent a state of the automaton is $\Omega(n/\log n)$, which gives us the lower bound stated in Theorem 1.3.

Before presenting the formal proof for the lower bound, we define 1-way QFAs precisely in the next section. We
then show how a restricted QFA for the language $L_n$ yields a serial encoding of n classical bits into a state of the
automaton. Theorem 4.1 then immediately gives a size lower bound of $2^{\Omega(n/\log n)}$ for restricted QFAs. We then
extend this lower bound to general QFAs in Section 5.3.

5.1 Technical preliminaries

A 1-way quantum finite automaton (QFA) is a theoretical model for a quantum computer with finite memory. It has
a finite set of basis states Q, which consists of three parts: accepting states, rejecting states and non-halting states.
The sets of accepting, rejecting and non-halting basis states are denoted by $Q_{acc}$, $Q_{rej}$ and $Q_{non}$, respectively. One
of the states, $q_0$, is distinguished as the starting state.

Inputs to a QFA are words over a finite alphabet $\Sigma$. We shall also use the symbols '¢' and '$' that do not belong to $\Sigma$
to denote the left and the right end marker, respectively. The set $\Gamma = \Sigma \cup \{¢, \$\}$ denotes the working alphabet of
the QFA. For each symbol $\sigma \in \Gamma$, a 1-way QFA has a corresponding unitary transformation $U_\sigma$ on the space $\mathbb{C}^Q$. A
1-way QFA is thus defined by describing $Q$, $Q_{acc}$, $Q_{rej}$, $Q_{non}$, $q_0$, $\Sigma$, and $U_\sigma$ for all $\sigma \in \Gamma$. We will often refer to 1-way
QFAs as simply QFAs, since we do not consider any other type of QFAs in this paper.

At any time, the state of a QFA is a superposition of basis states in Q. The computation starts in the superposition
$|q_0\rangle$. Then transformations corresponding to the left end marker '¢', the letters of the input word x and the right
end marker '$' are applied in succession to the state of the automaton, unless a transformation results in acceptance
or rejection of the input. A transformation corresponding to a symbol $\sigma \in \Gamma$ consists of two steps:

1. First, $U_\sigma$ is applied to the current state of the automaton, to obtain the new state $|\psi'\rangle$.

2. Then, $|\psi'\rangle$ is measured with respect to the observable $E_{acc} \oplus E_{rej} \oplus E_{non}$, where $E_{acc} = \mathrm{span}\{|q\rangle \mid q \in Q_{acc}\}$,
$E_{rej} = \mathrm{span}\{|q\rangle \mid q \in Q_{rej}\}$, $E_{non} = \mathrm{span}\{|q\rangle \mid q \in Q_{non}\}$. The probability of observing $E_i$ is equal to the
squared norm of the projection of $|\psi'\rangle$ onto $E_i$. On measurement, the state of the automaton "collapses" to
the projection onto the space observed, i.e., becomes equal to the projection, suitably normalized to a unit
superposition.

If we observe $E_{acc}$ (or $E_{rej}$), the input is accepted (or rejected). Otherwise, the computation continues, and
the next transformation, if any, is applied.

We regard these two steps together as reading the symbol $\sigma$.

A QFA M is said to accept (or recognize) a language L with probability $p > \frac{1}{2}$ if it accepts every word in L with
probability at least p, and rejects every word not in L with probability at least p.
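
The two-step reading procedure can be simulated classically for small automata. The sketch below keeps the unnormalized non-halting part of the state and accumulates the acceptance and rejection probabilities, which gives the same overall probabilities as collapsing and renormalizing after each measurement. The function names, the use of "^" as a stand-in for the left end marker, and the toy automaton (a reversible automaton accepting words with an odd number of a's) are all assumptions made for illustration.

```python
def permutation_matrix(perm):
    """Matrix sending basis state c to basis state perm[c]."""
    n = len(perm)
    return [[1.0 if perm[c] == r else 0.0 for c in range(n)] for r in range(n)]


def run_qfa(unitaries, q_acc, q_rej, start, word, left="^", right="$"):
    """Return (p_accept, p_reject) for a 1-way QFA on `word`.

    unitaries maps each symbol of the working alphabet to a square matrix
    (list of rows); basis states are the indices 0 .. dim-1.
    """
    dim = len(unitaries[right])
    state = [0j] * dim            # unnormalized non-halting component
    state[start] = 1 + 0j
    p_acc = p_rej = 0.0
    for symbol in [left] + list(word) + [right]:
        U = unitaries[symbol]
        # Step 1: apply the unitary for this symbol.
        state = [sum(U[r][c] * state[c] for c in range(dim))
                 for r in range(dim)]
        # Step 2: measure; halting components leave the computation.
        for q in range(dim):
            if q in q_acc:
                p_acc += abs(state[q]) ** 2
                state[q] = 0j
            elif q in q_rej:
                p_rej += abs(state[q]) ** 2
                state[q] = 0j
    return p_acc, p_rej


# Toy example: states 0, 1 non-halting, 2 accepting, 3 rejecting.
TOY = {
    "^": permutation_matrix([0, 1, 2, 3]),   # identity on the left marker
    "a": permutation_matrix([1, 0, 2, 3]),   # flip the parity state
    "$": permutation_matrix([3, 2, 1, 0]),   # parity 1 -> accept, 0 -> reject
}

if __name__ == "__main__":
    for w in ("", "a", "aa", "aaa"):
        print(repr(w), run_qfa(TOY, {2}, {3}, 0, w))
```

Because every matrix in the toy example is a permutation, it is also a reversible finite automaton in the sense defined in this section.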

A reversible finite automaton (RFA) is a QFA such that, for any $\sigma \in \Gamma$ and $q \in Q$, $U_\sigma |q\rangle = |q'\rangle$ for some $q' \in Q$. In
other words, the operator $U_\sigma$ is a permutation over the basis states; it maps each basis state to a basis state, not to
a superposition over several states.

The size of a finite automaton is defined as the number of (basis) states in it. The "space used by the automaton" 
refers to the number of (qu)bits required to represent an arbitrary automaton state. 



5.2 The lower bound for restricted QFAs 

Define an r-restricted 1-way QFA for a language L as a 1-way QFA that recognizes the language with probability $p > \frac{1}{2}$,
and which halts with non-zero probability before seeing the right end marker only after it has read r letters of
the input. We first show a lower bound on the size of n-restricted 1-way QFAs that accept $L_n$.

Let M be any n-restricted 1-way QFA accepting $L_n$ with constant probability $p > \frac{1}{2}$. The following claim formalizes
the intuition that the state of M after n symbols of the input have been read is an encoding of the input string.

Claim 5.1 There is a serial encoding of n bits into $\mathbb{C}^Q$, and hence into $\lceil \log |Q| \rceil$ qubits, where Q is the set of basis
states of the QFA M.

Proof: Let Q be the set of basis states of the QFA M, and let $Q_{acc}$ and $Q_{rej}$ be the set of accepting and
rejecting states respectively. Also, let $U_\sigma$ be the unitary operator of M corresponding to the symbol $\sigma \in \{a, b, ¢, \$\}$.
Let $E_{acc}$, $E_{rej}$ and $E_{non}$ be defined as in Section 5.1.

We define an encoding $f : \{a, b\}^n \to \mathbb{C}^Q$ of n-bit strings into unit superpositions over the basis states of the QFA M
by letting $|f(x)\rangle$ be the state of the automaton M after the input string $x \in \{a, b\}^n$ has been read. We assert that f
is a serial encoding.

To show that f is indeed such an encoding, we exhibit a suitable measurement for the ith bit of the input for
every $i \in [1..n]$. Let, for $y \in \{a, b\}^{n-i}$, $V_i(y) = U_{\$} U_y^{-1}$, where $U_y$ stands for the identity operator if y is the
empty word, and for $U_{y_{n-i}} U_{y_{n-i-1}} \cdots U_{y_1}$ otherwise. The ith measurement then consists of first applying the unitary
transformation $V_i(x_{i+1} \cdots x_n)$ to $|f(x)\rangle$, and then measuring the resulting superposition with respect to $E_{acc} \oplus E_{rej} \oplus
E_{non}$. (Note that the measurement for the ith bit assumes the knowledge of all the successive bits $x_{i+1}, \ldots, x_n$ of
the input.) Since for words with length at most n, containment in $L_n$ is decided by the last letter, and because such
words are accepted or rejected by the n-restricted QFA M with probability at least p only after the entire input has
been read, the probability of observing $E_{acc}$ if $x_i = a$, or $E_{rej}$ if $x_i = b$, is at least p. Thus, f defines a serial encoding,
as claimed. ■

Theorem 4.1 now immediately implies that $\lceil \log |Q| \rceil = \Omega(n/\log n)$ and thus $|Q| = 2^{\Omega(n/\log n)}$, where Q is as in the
claim above.



5.3 Extension to general QFAs 

It only remains to show that the lower bound on the size of restricted QFAs obtained above implies a lower bound on the size of general QFAs accepting $L_n$. We do this by showing that we can convert any 1-way QFA to an r-restricted 1-way QFA which is only $O(r)$ times as large as the original QFA. Taking r = n, and noting that an $O(n)$ factor does not affect a bound of the form $2^{\Omega(n/\log n)}$, it follows that the $2^{\Omega(n/\log n)}$ lower bound on the number of states of n-restricted 1-way QFAs recognizing $L_n$ continues to hold for general 1-way QFAs for $L_n$, exactly as stated in Theorem 1.3.
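
The arithmetic behind this step can be spelled out as follows. The constants $c_1, c_2 > 0$ and the symbol $S$ for the number of states of a general 1-way QFA for $L_n$ are names introduced here for illustration only. With r = n, the conversion yields an n-restricted QFA with at most $c_1 n S$ states, to which the restricted lower bound applies:

```latex
\begin{align*}
c_1\, n\, S \;\ge\; 2^{c_2 n/\log n}
\quad\Longrightarrow\quad
S \;\ge\; 2^{\,c_2 n/\log n \,-\, \log (c_1 n)} \;=\; 2^{\Omega(n/\log n)},
\end{align*}
```

where the last equality holds because $\log(c_1 n) = o(n/\log n)$.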



The idea behind the construction of a restricted QFA from a general QFA is to carry the halting parts of the superposition of the original automaton as "distinguished" non-halting parts of the state of the new automaton until at least r more symbols of the input have been read since the halting part was generated, or until the right end marker is encountered, and then to map them to the accepting or rejecting subspaces appropriately.



Lemma 5.1 Let M be a 1-way QFA with S states recognizing a language L with probability p. Then there is an r-restricted 1-way QFA M' with $O(rS)$ states that recognizes L with probability p.



Proof: Let M be a 1-way QFA with $Q$ as the set of basis states, $Q_{acc}$ as the set of accepting states, $Q_{rej}$ as the set of rejecting states, and $q_0$ as the starting state. Let M' be the automaton with basis state set

\[ Q \cup \bigl(Q_{acc} \times \{0, 1, \ldots, r+1\} \times \{acc, non\}\bigr) \cup \bigl(Q_{rej} \times \{0, 1, \ldots, r+1\} \times \{rej, non\}\bigr). \]

Let $Q_{acc} \cup (Q_{acc} \times \{0, 1, \ldots, r+1\} \times \{acc\})$ be its set of accepting states, let $Q_{rej} \cup (Q_{rej} \times \{0, 1, \ldots, r+1\} \times \{rej\})$ be the set of rejecting states, and let $q_0$ be the starting state. If, for a state $q \in Q$, there is a transition

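\[ |q\rangle \mapsto \sum_{q'} \alpha_{q'} |q'\rangle \]

in M on symbol $\sigma$, then in M' we have the following transitions. On the '$' symbol, we have the same transition, and on $\sigma \neq \$$, we have

\[ |q\rangle \mapsto \sum_{q' \notin Q_{acc} \cup Q_{rej}} \alpha_{q'} |q'\rangle \;+ \sum_{q' \in Q_{acc} \cup Q_{rej}} \alpha_{q'} |q', 0, non\rangle . \]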

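The transitions from the states not originally in M are given by the following rules. On the '$' symbol,

\[ |q, i, non\rangle \mapsto \begin{cases} |q, i, acc\rangle & \text{if } q \in Q_{acc} \text{ and } i \le r \\ |q, i, rej\rangle & \text{if } q \in Q_{rej} \text{ and } i \le r \end{cases} \]

and on a symbol $\sigma \in \{a, b\}$,

\[ |q, i, non\rangle \mapsto \begin{cases} |q, i+1, non\rangle & \text{if } i < r \\ |q, i+1, acc\rangle & \text{if } q \in Q_{acc} \text{ and } i = r \\ |q, i+1, rej\rangle & \text{if } q \in Q_{rej} \text{ and } i = r \end{cases} \]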



The rest of the transitions may be defined arbitrarily, subject to the condition of unitarity. 

It is not difficult to verify that M' is an r-restricted 1-way QFA (of size $O(rS)$) accepting the same language as M, and with the same probability. ■
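
The size claim and the rules for the new states can be checked on a toy instance. The following is a minimal sketch, not the authors' construction in full: it tracks only basis-state labels (no amplitudes), the function names and the four-state example are hypothetical, and it assumes the end-marker rule applies to every counter value from 0 to r.

```python
from itertools import product

def restricted_states(Q, Q_acc, Q_rej, r):
    """Enumerate the basis states of M' from the proof of Lemma 5.1."""
    counters = range(r + 2)  # the counter set {0, 1, ..., r+1}
    states = set(Q)
    states |= set(product(Q_acc, counters, ("acc", "non")))
    states |= set(product(Q_rej, counters, ("rej", "non")))
    return states

def step_tagged(state, symbol, Q_acc, r):
    """Image of a tagged basis state (q, i, 'non') under the listed rules.

    Returns None where the proof leaves the transition unspecified.
    Assumption: the end-marker rule covers every counter value 0..r.
    """
    q, i, tag = state
    if tag != "non" or i > r:
        return None
    halt = "acc" if q in Q_acc else "rej"
    if symbol == "$":
        return (q, i, halt)          # end marker: halt immediately
    if i < r:
        return (q, i + 1, "non")     # keep carrying the halting part
    return (q, i + 1, halt)          # i == r: r+1 symbols read, now halt

# Hypothetical toy automaton: 4 basis states, one accepting, one rejecting.
Q, Q_acc, Q_rej, r = {"q0", "q1", "q2", "q3"}, {"q2"}, {"q3"}, 3
states = restricted_states(Q, Q_acc, Q_rej, r)
size = len(states)  # |Q| + 2(r+2)(|Q_acc| + |Q_rej|), at most S + 2(r+2)S
```

Each rule sends distinct basis states to distinct basis states, which is what allows the remaining transitions to be completed to a unitary operator.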



5.4 Some remarks 



We observe that the size $O(n)$ versus size $\Omega(2^n)$ separation between DFAs and 1-way QFAs is the worst possible if we restrict ourselves to languages that can be accepted by 1-way QFAs with probability of correctness that is high enough (at least 7/9). Such languages include all finite regular languages, since these can be accepted by 1-way RFAs. This follows from the result of Ambainis and Freivalds that any language accepted by a QFA with high enough probability can be accepted by a 1-way RFA which is at most exponentially bigger than the minimal DFA accepting the language. However, it is not clear that this is also the largest separation in the case of languages that are accepted by 1-way QFAs with smaller probability of correctness.

Another open problem involves the blow-up in size when simulating a 1-way probabilistic finite automaton (PFA) by a 1-way QFA. The only known way of doing this is to simulate the PFA by a 1-way DFA and then simulate the DFA by a QFA. Both simulating a PFA by a DFA and simulating a DFA by a QFA (this paper) can involve an exponential or nearly exponential increase in size. This means that the straightforward simulation of a probabilistic automaton by a QFA (described above) could result in a doubly-exponential increase in size. However, we do not know of any examples where both transforming a PFA into a DFA and transforming a DFA into a QFA cause big increases in size. Better simulations of probabilistic automata by QFAs may well be possible.

In general, it is not known how to simulate a probabilistic coin-flip by a purely quantum-mechanical algorithm if space is limited. For example, the only known simulation of $S(n)$-space probabilistic Turing machines by $S(n)$-space quantum Turing machines can create quantum Turing machines running in expected time doubly exponential in $S(n)$ [14]. Finding better simulations or proving that they do not exist is another interesting direction to explore.